Abstract. Cartesian closed categories have been shown by several authors to provide the
right framework for the model theory of λ-calculus. The second author developed this as a syntactic
equivalence between two calculi, giving rise to a new kind of combinatory logic: the categorical
combinatory logic, where computations can be done through simple rewrite rules while, as usual
with combinators, avoiding problems with variable name clashes. This paper goes further (though
not requiring previous knowledge of categorical combinatory logic) and describes a very simple
machine where categorical terms can be considered as code acting on a graph of values (the
essential actions are LISP's "cons", "car" and "cdr", as well as "rplacd" to implement recursion).
The only saving mechanism is a stack containing pointers to code or to the graph. Abstractions
are handled in the very same way as in P. Landin's SECD machine, using closures. The machine
is called the categorical abstract machine, or CAM. The CAM is easier to grasp and to prove correct
than the SECD machine. The paper discusses the implementation of a real functional programming
language, ML, through the CAM. A basic acquaintance with λ-calculus is required.



1. Introduction 

We use the categorical framework developed in a syntactic and computational
fashion in [8] to define an abstract machine for implementing functional programming
languages with static binding of variables, i.e. λ-calculus based languages.
Our machine, called the categorical abstract machine or CAM, may be viewed as a
synthesis of three different approaches to the implementation of functional
programming languages:

- De Bruijn's formalism for eliminating problems caused by α-conversions [3],

- Turner's SK-reduction machine [21], 

- Landin's SECD machine [12, 14]. 

As recalled in more detail below, De Bruijn's trick was to replace variable names
by numbers recording their binding heights, which gives an automatic treatment
of α-conversions while performing β-reductions. The point is that De Bruijn's
notation may be considered as a combinatory logic which is nothing but the second
author's categorical combinatory logic [12], endowed with rewrite rules of the very
same kind as the SK-rules of combinatory logic involved in the SK-reduction
machine. But while the SK-reduction machine sticks to the rewriting mechanism,
the CAM is a very basic machine: combinatory terms are nothing but sequences of
instructions acting on a register and a stack. The CAM handles abstractions as
closures in the same way as the SECD machine. However, the save-restore mechanisms
are quite different in the two machines.

The paper is organized as follows: the rest of the introduction is devoted to giving
the reader some feeling for categorical combinators and their 'SK-like' rewrite
rules. Then in Section 2 we show that our combinatory rules naturally suggest
machine instructions, and define the core of the CAM. Section 3 sketches correctness
proofs, which are fully developed in [15]. Section 4 describes extensions to handle
arithmetic, conditionals and recursion, as well as lazy evaluation which is naturally
handled in the call-by-value framework by explicitly introducing delaying or 'freezing'
mechanisms. Section 5 describes an ML implementation of the CAM, hence
giving rise to an ML interpreter written in ML. Section 6 is a discussion.

We briefly illustrate the different combinatory approaches to evaluation.
Throughout the paper we shall use an ML-like syntax (GorLCF). Our example is

let x = plus in x (4,(x where x = 3));

We apologize for the limited interest of the program, but a more involved example
would obscure the discussion. Expressed as a λ-expression, our program looks as
follows

M = (λx.x(4,(λx.x)3)) +

By doing so we lose some optimization (see Section 2), but we do not care here.

First we take a look at the classical combinator approach. The expression M is
compiled into a combinatory term. But combinatory logic does not know about
products and handles only curry-ed functions. So we have to consider a curry-ed
addition and write

N = (λx.x4((λx.x)3))+

The translation into combinatory logic is (without using Turner's optimizations)

S(SI(K4))(S(KI)(K3))+

Now using one or another computation strategy based on the well-known rules

Sxyz = xz(yz), Kxy = x

we get 7. Here in an innermost-leftmost sequence:

S(SI(K4))(S(KI)(K3))+ → (SI(K4)+)(S(KI)(K3)+)
                      → (I+)(K4+)(S(KI)(K3)+)
                      → +(K4+)(S(KI)(K3)+) → +4(S(KI)(K3)+) → +4((KI+)(K3+))
                      → +4(I(K3+)) → +43 → 7.
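
The reduction above can be replayed mechanically. The following Python sketch is only an
illustration and not part of the original development: it assumes that an application is
encoded as a nested pair (f, x), that the atoms are the strings 'S', 'K', 'I', '+' and
Python integers, and it uses a head-first (leftmost-outermost) strategy rather than the
innermost-leftmost sequence displayed above. The names app and reduce_sk are hypothetical.

```python
def app(head, *args):
    """Build the left-nested application ((head a1) a2) ... an."""
    term = head
    for arg in args:
        term = (term, arg)
    return term


def reduce_sk(term):
    """Normalize a term over S, K, I, '+' and integers (head-first strategy)."""
    # Unwind the application spine: term = head a1 ... an.
    pending = []                      # pending[-1] is a1, the first argument
    while isinstance(term, tuple):
        term, arg = term
        pending.append(arg)
    head = term
    if head == 'I' and len(pending) >= 1:        # I x = x
        x = pending.pop()
        return reduce_sk(app(x, *reversed(pending)))
    if head == 'K' and len(pending) >= 2:        # K x y = x
        x, _ = pending.pop(), pending.pop()
        return reduce_sk(app(x, *reversed(pending)))
    if head == 'S' and len(pending) >= 3:        # S x y z = x z (y z)
        x, y, z = pending.pop(), pending.pop(), pending.pop()
        return reduce_sk(app(x, z, (y, z), *reversed(pending)))
    if head == '+' and len(pending) == 2:        # curried addition on integers
        m, n = reduce_sk(pending.pop()), reduce_sk(pending.pop())
        return m + n
    return app(head, *reversed(pending))         # no rule applies


# N = S(SI(K4))(S(KI)(K3))+
K4 = app('K', 4)
K3 = app('K', 3)
N = app('S', app('S', 'I', K4), app('S', app('K', 'I'), K3), '+')
result = reduce_sk(N)
print(result)
```

Since the term has a normal form, the choice of strategy does not change the result, only
the order in which the S, K and I rules fire.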

Now we turn to De Bruijn's style. We could have computed M by performing
β-reductions, yielding

M = (λx.x(4,(λx.x)3))+ → +(4,(λx.x)3) → +(4, 3) → 7.

But it is well known that β-conversions give rise to boring problems of name clashes
(not in our example, which is too simple). For instance (λxy.x)y by no means reduces
to λy.y. De Bruijn's idea is to avoid the name clashes by getting rid of the names
themselves. The only important information about a variable in a closed term (i.e.
having no free variables) is its binding height, i.e. the number of λ's between it and
(not including) the binding one. Then the variable names are replaced by this
number (not to be confused with an integer constant), and De Bruijn showed that
this gives rise to an elegant treatment of β-reductions. For instance

P = λy.(λxy.x) y becomes λ.(λλ.1)0.

A suitable rephrasing of the β-rule (see Section 3) yields λλ.1, and one never has
to change explicitly λxy.x into, say, λxz.x, to avoid the name clash.
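
The translation of P and its reduction can be checked with a short Python sketch. It is
an illustration under stated assumptions, not the formulation used in this paper: terms
are encoded as tagged tuples, and the β-step uses the usual textbook shift-and-substitute
presentation, which may differ in form from the rephrasing referred to above. All
function names are hypothetical.

```python
def to_de_bruijn(term, bound=()):
    """Translate a closed named term into De Bruijn notation.

    Named terms:     ('var', x) | ('lam', x, body) | ('app', f, a)
    De Bruijn terms: ('idx', n) | ('lam', body)    | ('app', f, a)
    `bound` lists the enclosing binders, innermost first.
    """
    tag = term[0]
    if tag == 'var':
        return ('idx', bound.index(term[1]))     # binding height of the variable
    if tag == 'lam':
        return ('lam', to_de_bruijn(term[2], (term[1],) + bound))
    return ('app', to_de_bruijn(term[1], bound), to_de_bruijn(term[2], bound))


def shift(term, d, cutoff=0):
    """Add d to every index that refers to a binder outside the term."""
    tag = term[0]
    if tag == 'idx':
        return ('idx', term[1] + d) if term[1] >= cutoff else term
    if tag == 'lam':
        return ('lam', shift(term[1], d, cutoff + 1))
    return ('app', shift(term[1], d, cutoff), shift(term[2], d, cutoff))


def subst(term, j, s):
    """Replace the index j by s inside term."""
    tag = term[0]
    if tag == 'idx':
        return s if term[1] == j else term
    if tag == 'lam':
        return ('lam', subst(term[1], j + 1, shift(s, 1)))
    return ('app', subst(term[1], j, s), subst(term[2], j, s))


def beta(redex):
    """Contract a redex of the form ('app', ('lam', body), arg)."""
    _, (_, body), arg = redex
    return shift(subst(body, 0, shift(arg, 1)), -1)


# P = λy.(λx.λy.x) y
P = ('lam', 'y', ('app', ('lam', 'x', ('lam', 'y', ('var', 'x'))), ('var', 'y')))
P_db = to_de_bruijn(P)                 # λ.(λλ.1)0
reduced = ('lam', beta(P_db[1]))       # contract the redex under the outer λ
print(P_db)
print(reduced)
```

No renaming of bound variables occurs anywhere: the index arithmetic in shift plays the
role that α-conversion plays in the named calculus.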

Actually it was shown in [8] that De Bruijn's notation not only yields an
elegant treatment of β-conversion, but also may be considered as a combinatory
logic with rules of the very same kind as the SK-rules.

Since we would like the paper to be self-contained, we try to introduce these 
ideas briefly below (more details may be found in the cited references). We shall 
introduce a semantic setting to motivate our rules. 

First we come back to the normal λ-calculus, and try to describe the meanings
of expressions. They depend on associations of values with identifiers, i.e. on
environments. Thus M has as meaning a function ⟦M⟧ associating a value with an
environment. We get the well-known semantic equations, where applying a function
to its argument is denoted by simple juxtaposition:

⟦x⟧ρ = ρ(x),
⟦c⟧ρ = c,
⟦MN⟧ρ = ⟦M⟧ρ(⟦N⟧ρ),
⟦λx.M⟧ρd = ⟦M⟧ρ[x←d]

where ρ(x) is the value associated with x in ρ, c is a constant denoting a value,
also called c following usual practice, ρ[x←d] is ρ where x has been updated with
value d.

How should these equations be rephrased when replacing classical λ-expressions
by expressions in De Bruijn's notation? We assume that ρ has the shape
(...((), vₙ) ..., v₀) where vᵢ is associated with i. We leave it to the reader to convince
himself that the following is obtained:

⟦0⟧(ρ,d) = d, ⟦n+1⟧(ρ,d) = ⟦n⟧ρ,
⟦c⟧ρ = c,
⟦MN⟧ρ = ⟦M⟧ρ(⟦N⟧ρ),
⟦λ.M⟧ρd = ⟦M⟧(ρ,d).

But we are not interested in meanings for themselves, we want meanings to suggest
computations. Our combinatorial approach stresses that the meaning of, say MN,
is a combination of the meanings of M and N. We introduce three combinators: S
of arity 2, Λ and ' of arity 1, and infinitely many combinators n! with the intention that

⟦n⟧ = n!, ⟦c⟧ = 'c, ⟦MN⟧ = S(⟦M⟧,⟦N⟧), ⟦λ.M⟧ = Λ(⟦M⟧).

This allows us to transform our semantic equations into purely syntactic ones:

0!(x,y) = y, (n+1)!(x,y) = n!x,
('x)y = x,
S(x,y)z = xz(yz),
Λ(x)yz = x(y,z).
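
The syntactic equations above can be read directly as definitions of higher-order
functions. The Python sketch below is a hedged illustration: it assumes environments are
nested Python pairs ((..., v1), v0) with () as the empty environment, and it treats the
curried addition of the term N as an ordinary quoted constant. The names proj, quote, S,
Lam and plus are hypothetical.

```python
def proj(n):
    """n!: fetch the value of binding height n in an environment."""
    def access(env):
        for _ in range(n):
            env = env[0]          # (n+1)!(x, y) = n! x
        return env[1]             # 0!(x, y) = y
    return access


def quote(c):
    """'c: forget the environment and return the constant."""
    return lambda env: c          # ('x) y = x


def S(x, y):
    """S(x, y) z = x z (y z)."""
    return lambda env: x(env)(y(env))


def Lam(x):
    """Λ(x) y z = x (y, z): currying."""
    return lambda env: lambda arg: x((env, arg))


def plus(m):
    """Curried addition, as used in the term N."""
    return lambda n: m + n


# N = (λx. x 4 ((λx.x) 3)) +   becomes   S(Λ(S(S(0!, '4), S(Λ(0!), '3))), '+)
N = S(Lam(S(S(proj(0), quote(4)), S(Lam(proj(0)), quote(3)))), quote(plus))
result = N(())                    # evaluate in the empty environment
print(result)
```

Each combinator consumes an environment as its last argument, which is the sense in which
the first three rules forget an argument and the rule for S distributes it.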

This is not so far from the SK-rules: the first three rules forget about an argument 
just as K does, while the fourth rule is an uncurried form of the S-rule; finally the 
last rule precisely describes currying, i.e. transforming a function with two arguments 
(more precisely one argument which is a couple, see below for a discussion of 
coupling) into a function of its first argument yielding a function of its second 
argument. Hence, roughly speaking, categorical combinatory logic is something like 
"combinatory logic with explicit products". 

This remark directly leads to considering explicit products in the λ-calculus, as
in the subterm (4, (λx.x)3) of our example. So we introduce a pairing combinator
⟨ , ⟩, intending

⟦(M, N)⟧ = ⟨⟦M⟧, ⟦N⟧⟩.

It is also very natural to associate the destructors corresponding to this new combinator,
i.e. the projections Fst and Snd. This gives rise to the following equations:

Fst(x, y) = x,
Snd(x, y) = y,
⟨x, y⟩z = (xz, yz).

The last equation relates pairing with coupling: in a set theoretical setting, the
pair of two functions f: D→E, g: D→F is the function h: D→E×F which has
as output the couple of their outputs.

Notice that what we call here couples are often called pairs (in particular in ML).
But here we have to distinguish (the couple of f, g above is not a function, it is an
element in (D⇒E) × (D⇒F)), and we have no better name for pairing functions.

Actually we have introduced coupling earlier in our exposition, when discussing
the 'shape' of the environment: we have taken the view of a tree (or graph)
representation of environments, the nodes being coupling nodes. A final remark
about coupling: we stress that Fst, Snd have arity 0; hence Fst(x, y) has to be read
as Fst applied to the couple of x, y.

Clearly there is some redundancy in all these rules: the rules about Fst and (n+1)!,
about Snd and 0!, about ⟨ , ⟩ and S are quite similar: duplication may be avoided
by introducing a new binary operator, the composition, i.e. the basic tool of category
theory, and a new constant App.

Now consider S( , ) and n! as shorthands for App ∘ ⟨ , ⟩ and Snd ∘ Fst^n (setting
Fst^(n+1) = Fst ∘ Fst^n). Then we can replace the two tables of combinatory equations
by the following:

(ass)     (x ∘ y)z = x(yz),
(fst)     Fst(x, y) = x,
(snd)     Snd(x, y) = y,
(dpair)   ⟨x, y⟩z = (xz, yz),
(ac)      App(Λ(x)y, z) = x(y, z),
(quote)   ('x)y = x

(notice that ass relates composition to application as dpair does with pairing and
coupling).

We have chosen to make one rule out of the above rules on S, Λ (notice that the
rule on S becomes S(x, y)z = App(xz, yz)), to get a more homogeneous treatment
of our three operators of arity 0: Fst, Snd and App. Notice also that we get as a
derived rule

Λ(x ∘ Snd)yz = (x ∘ Snd)(y, z) = xz (without using quote)

which is of course implied by quote. We shall use this coding for functional constants
such as +. The reason for the difference between the codings of basic and functional
constants will appear clearly after the description of the CAM instructions.

Now it is time to apply our general discussion to an actual computation. Our
example M becomes M' = S(Λ(S(0!, ⟨'4, S(Λ(0!), '3)⟩)), Λ(+ ∘ Snd)).

Here is where our discussion gets connected with the SECD machine approach: M'
is going to be evaluated by applying it to an environment, namely the empty environment
since our term is closed.

We evaluate M'(), by an innermost-leftmost strategy. We set

A = S(0!, ⟨'4, B⟩) and B = S(Λ(0!), '3).

Here is the reduction sequence:

S(Λ(A), Λ(+ ∘ Snd))() → App(Λ(A)(), Λ(+ ∘ Snd)())
                      → Aρ   (setting ρ = ((), Λ(+ ∘ Snd)()))
                      → App(0!ρ, ⟨'4, B⟩ρ) → App(Λ(+ ∘ Snd)(), ⟨'4, B⟩ρ)
                      → App(Λ(+ ∘ Snd)(), ('4ρ, Bρ))
                      → App(Λ(+ ∘ Snd)(), (4, Bρ))
                      → App(Λ(+ ∘ Snd)(), (4, App(Λ(0!)ρ, '3ρ)))
                      → App(Λ(+ ∘ Snd)(), (4, App(Λ(0!)ρ, 3)))
                      → App(Λ(+ ∘ Snd)(), (4, 0!(ρ, 3)))
                      → App(Λ(+ ∘ Snd)(), (4, 3)) → (+ ∘ Snd)((), (4, 3))
                      → +(Snd((), (4, 3))) → +(4, 3) → 7.

The last step is to explain how all these computations can be carried out by an
abstract machine, which we do in the next section.
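
Before turning to the machine, the rules (ass), (fst), (snd), (dpair), (ac) and (quote)
can be exercised by a small recursive evaluator. The Python sketch below is an
illustration under assumptions that are not in the original text: categorical terms are
tagged tuples, couples are Python pairs, a value Λ(x)y is an instance of a hypothetical
Closure class, and the constant + is modelled by a hypothetical 'prim' combinator acting
on a couple of integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Closure:
    """The value Λ(x)y: a body x together with its declaration environment y."""
    body: tuple
    env: object


FST, SND, APP = ('fst',), ('snd',), ('app',)


def comp(x, y):
    return ('comp', x, y)                   # x ∘ y


def pair(x, y):
    return ('pair', x, y)                   # ⟨x, y⟩


def cur(x):
    return ('cur', x)                       # Λ(x)


def quote(c):
    return ('quote', c)                     # 'c


def S(x, y):
    return comp(APP, pair(x, y))            # S(x, y) = App ∘ ⟨x, y⟩


def apply(term, v):
    """Evaluate the redex `term v`, where v is already a value."""
    tag = term[0]
    if tag == 'comp':                       # (ass)   (x ∘ y)z = x(yz)
        return apply(term[1], apply(term[2], v))
    if tag == 'fst':                        # (fst)   Fst(x, y) = x
        return v[0]
    if tag == 'snd':                        # (snd)   Snd(x, y) = y
        return v[1]
    if tag == 'pair':                       # (dpair) ⟨x, y⟩z = (xz, yz), left first
        return (apply(term[1], v), apply(term[2], v))
    if tag == 'app':                        # (ac)    App(Λ(x)y, z) = x(y, z)
        closure, arg = v
        return apply(closure.body, (closure.env, arg))
    if tag == 'cur':                        # Λ(x)y is a value: build it
        return Closure(term[1], v)
    if tag == 'quote':                      # (quote) ('x)y = x
        return term[1]
    if tag == 'prim':                       # built-in function on values
        return term[1](v)
    raise ValueError(f'unknown combinator {tag!r}')


add = ('prim', lambda couple: couple[0] + couple[1])
B = S(cur(SND), quote(3))                   # S(Λ(0!), '3)
A = S(SND, pair(quote(4), B))               # S(0!, ⟨'4, B⟩)
M1 = S(cur(A), cur(comp(add, SND)))         # M'
result = apply(M1, ())
print(result)
```

Because Python evaluates the two components of a pair from left to right before building
it, the evaluator follows the same innermost-leftmost order as the sequence above.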

2. The core of the categorical abstract machine 

Before going further, it is time to justify the attribute 'categorical'. Categories are
a game with composition and identity (the identity will arise to perform an optimization,
see below). Cartesian categories add rules about products to the game,
involving ⟨ , ⟩ (pairing), Fst, Snd. Cartesian closed categories add to the game Λ
(currying) and App, which make it possible to talk about exponentials, i.e. function
spaces. More detail would lead us out of our scope and we refer to [8].

Our rules evaluate categorical terms like M' above, built with those combinators,
by applying them to an environment, which is built by coupling its different components.
Notice that these last two operators (applying and coupling) are not present in M';
applying is introduced when starting with M'(), while new couples are created each time
dpair, ac are used. So we could call them dynamic combinators, while the others, which
appear at compile time, are static.

Now we want to suggest that the static operators naturally give rise to very basic
machine instructions. First we remark that along the rules applied in reducing M'()
above, the redexes all had the form Mv where v is a value, i.e. a term in normal
form w.r.t. our rules, and M is (the transformation of) a term in De Bruijn's notation.
Now consider M as code acting on v. M is made of elementary pieces of code.

Fst and Snd are easily viewed as instructions: Fst acts on a value (v₁, v₂) by
accessing its first son (supposing the value is represented as a binary tree). So
this can explain how a variable is going to be evaluated. For the couples, we are
concerned with the action of ⟨M, N⟩ on v. The equation about ⟨ , ⟩ tells us what to
do:

⟨M, N⟩v = (Mv, Nv).

So the actions of M, N on v should be carried out independently and then put
together by building a tree whose root is a couple, and whose sons are the values
v₁, v₂ obtained. In a sequential machine we have to choose to evaluate, say M, first.
This yields v₁. But we should have stored v before working on it, in order to restore
it when tackling N, yielding v₂. Finally we put together v₁ and v₂, but this supposes
again that we have stored v₁, which should have been done exactly when we have
restored v.

Now we have the structure of our machine: a term (rather a graph in an actual
implementation), which is a structured value, a code, and a stack (or dump). So a
state of the machine is a triplet with respective components T, C, S. The discussion
above suggests defining the code for ⟨M, N⟩ as the following sequence of instructions:
'⟨' followed by the sequence of instructions corresponding to the code of M,
followed by ',', followed by the sequence of instructions corresponding to the code
of N, followed by '⟩', where the effects of '⟨', ',' and '⟩' are respectively:

⟨: push the term onto the top of the stack,
,: swap the term and the top of the stack,
⟩: make a couple out of the top of the stack and the term, replace the term
   by the couple just built, and pop the stack.

Notice that we have not done any compiling at all: the concrete syntax used for
pairing corresponds exactly to where control instructions have to be inserted to
combine the evaluations of M, N into the evaluation of ⟨M, N⟩.

Now that we have the structure of the machine we can describe Fst and Snd in 
a more precise way: 

Fst: expects a term (s, t) and replaces it by s,
Snd: expects a term (s, t) and replaces it by t.

The code for n! is made of n 'Fst' instructions followed by an 'Snd' instruction.
Again to get the code, we would have had nothing to do if we had taken x | y (or
even xy) as the concrete syntax of the composition which we denote by y ∘ x
according to the most usual mathematical practice.

For currying, the code of Λ(M) is just Λ(C) where C is the code of M (actually
its address in a realistic implementation, see Section 5), and the action of 'Λ' is

Λ: replaces the term s by C:s where C is the code encapsulated in Λ.

C:s is only a shorthand notation for 'Λ(M)s'. From the rewrite rule point of view,
the action of 'Λ' is none, since Λ(M)v is a value as soon as v is a value, hence
may not be rewritten. In terms of actions, this can be rephrased as: the action of
Λ(M) on v is Λ(M)v, whence the description of the command 'Λ', which is nothing
but building a closure as in the SECD machine. Indeed, as stressed by our notation
C:s, we handle as a value a couple made of the code corresponding to the body
of a λ-expression, and a value, which is nothing but the declaration environment
of the function described by the abstraction (see the second example below).

We continue with λ-calculus application, rephrased as App ∘ ⟨M, N⟩. The underlying
rule is now (App ∘ ⟨M, N⟩)v = App(Mv, Nv).

Suppose (Mv, Nv) has been evaluated to (v₁, v₂), which is the job of the code
associated with ⟨M, N⟩. What remains to be done is the instruction 'App', which
expects v₁ = Λ(P)v₁' and will perform App(Λ(P)v₁', v₂) = P(v₁', v₂). In terms of our
machine the code corresponding to App ∘ ⟨M, N⟩ is the code for ⟨M, N⟩ followed
by 'App', with the following effect:

App: expects a term (C:s, t), replaces it by (s, t) and prefixes the rest of the code
by C.

We still have the constants to deal with: for basic constants like integers the code
for 'c is '(c), with the following action:

': replaces the term by the encapsulated constant.

For others, which are functions, like the addition, we use the coding suggested in
the previous section: the code for 'c is Λ(C) where C is Snd followed by 'c'.

The description of the core of our machine is now complete. We summarize it
in Table 1. We give the instructions names related to their behaviour, so
that

Fst   Snd   ⟨      ,      ⟩      App   Λ     '

become

fst   snd   push   swap   cons   app   cur   quote

and we use semicolons to separate instructions.

Table 1

CAM: the λ-calculus with explicit products

        Configuration                            Configuration

Term       Code           Stack         Term       Code     Stack

(s,t)      fst;C          S             s          C        S
(s,t)      snd;C          S             t          C        S
s          (quote c);C    S             c          C        S
s          (cur C);C1     S             (C:s)      C1       S
s          push;C         S             s          C        s.S
t          swap;C         s.S           s          C        t.S
t          cons;C         s.S           (s,t)      C        S
(C:s,t)    app;C1         S             (s,t)      C;C1     S
(m,n)      plus;C         S             m+n        C        S

Notice that the sequences 'cons; app', 'cons; plus' should obviously be contracted,
avoiding a useless 'cons', so that 'app', 'plus', would act as true binary instructions,
taking as arguments the top of the stack and the term. However, we keep hereafter
the decomposition as it is, because this allows a simpler explanation of a number
of optimizations. In our real implementation, the optimization is performed when
expanding CAM code to host machine code.
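
The transitions of Table 1 translate almost line for line into an interpreter loop. The
following Python sketch is an illustration under assumptions that are not part of the
paper: instructions without argument are plain strings, 'quote' and 'cur' are pairs
carrying their argument, code is a Python list, and the hypothetical class Closure stands
for the value C:s. The real implementation discussed in Section 5 is organized
differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Closure:
    """C:s, a piece of code together with its declaration environment."""
    code: tuple
    env: object


def run(code, term=()):
    """Execute CAM code on an initial term; return the final (term, stack)."""
    code, stack = list(code), []          # the top of the stack is stack[-1]
    while code:
        instr, code = code[0], code[1:]
        op, arg = (instr, None) if isinstance(instr, str) else instr
        if op == 'fst':
            term = term[0]
        elif op == 'snd':
            term = term[1]
        elif op == 'quote':
            term = arg
        elif op == 'cur':
            term = Closure(tuple(arg), term)
        elif op == 'push':
            stack.append(term)
        elif op == 'swap':
            term, stack[-1] = stack[-1], term
        elif op == 'cons':
            term = (stack.pop(), term)
        elif op == 'app':                 # term is (C:s, t)
            closure, t = term
            term = (closure.env, t)
            code = list(closure.code) + code
        elif op == 'plus':
            m, n = term
            term = m + n
        else:
            raise ValueError(f'unknown instruction {op!r}')
    return term, stack


# Code of M' = S(Λ(A), Λ(+ ∘ Snd)) with A = S(0!, ⟨'4, B⟩) and B = S(Λ(0!), '3)
B = ['push', ('cur', ['snd']), 'swap', ('quote', 3), 'cons', 'app']
A = ['push', 'snd', 'swap', 'push', ('quote', 4), 'swap', *B, 'cons', 'cons', 'app']
M1 = ['push', ('cur', A), 'swap', ('cur', ['snd', 'plus']), 'cons', 'app']
final_term, final_stack = run(M1)
print(final_term, final_stack)
```

Slicing the code list on every step keeps the sketch close to the table; a realistic
implementation would use a program counter and code addresses instead.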

We run our example, using the mathematical symbols to stress the relation with 
the rewrite rules used at the end of the previous section (A, B denote the codes 
corresponding to previous section's A, B). 

Term              Code                      Stack

()                ⟨Λ(A),Λ(Snd+)⟩App         []
()                Λ(A),Λ(Snd+)⟩App          [()]
A:()              ,Λ(Snd+)⟩App              [()]
()                Λ(Snd+)⟩App               [A:()]

Now '+' stands as an abbreviation for (Snd+):()

+                 ⟩App                      [A:()]
(A:(),+)          App                       []
((),+)            ⟨Snd,⟨'4,B⟩⟩App           []
((),+)            Snd,⟨'4,B⟩⟩App            [((),+)]
+                 ,⟨'4,B⟩⟩App               [((),+)]
((),+)            ⟨'4,B⟩⟩App                [+]
((),+)            '4,B⟩⟩App                 [((),+);+]
4                 ,B⟩⟩App                   [((),+);+]
((),+)            B⟩⟩App                    [4;+]
((),+)            Λ(Snd),'3⟩App⟩⟩App        [((),+);4;+]
Snd:((),+)        ,'3⟩App⟩⟩App              [((),+);4;+]
((),+)            '3⟩App⟩⟩App               [Snd:((),+);4;+]
3                 ⟩App⟩⟩App                 [Snd:((),+);4;+]
(Snd:((),+),3)    App⟩⟩App                  [4;+]
(((),+),3)        Snd⟩⟩App                  [4;+]
3                 ⟩⟩App                     [4;+]
(4,3)             ⟩App                      [+]
(+,(4,3))         App                       []

We expand back our abbreviation:

((),(4,3))        Snd+                      []
(4,3)             +                         []
7                 []                        []

The above session clearly demonstrates the benefit of our coding of functional
constants. If we had taken the code '(quote +)' for the constant +, we would have
been constrained to change the behaviour of 'app', which would have to test whether
the first component of the term is a closure, or any of the functional constants. The
chosen coding avoids performing such tests.

Now we run a more involved example, where the closure mechanism appears
clearly, in connection with static binding:

let x = 5 in let z y = y + x in let x = 1 in (z x) * 2;;

The corresponding λ-expression is

P = (λx.(λz.(λx.(zx) * 2)1)(λy.y + x))5.

But actually the first notation is better and leads to a simple compile time optimization,
as we explain now. The point is that the code corresponding to (λ.M)N is
'push', followed by 'cur C' (where C is the code of M), followed by 'swap', followed
by the code C1 of N followed by 'cons' and 'app'. Supposing that evaluating C1
on term s yields value v, the important steps are:

s          push;(cur C);swap;C1;cons;app    S
s          (cur C);swap;C1;cons;app         s.S
C:s        swap;C1;cons;app                 s.S
s          C1;cons;app                      (C:s).S
v          cons;app                         (C:s).S
(C:s,v)    app                              S
(s,v)      C                                S

Now the same result is achieved by the following optimized code: 'push' followed
by the code of N, followed by 'cons', followed by the code of M, as shown below:

s          push;C1;cons;C                   S
s          C1;cons;C                        s.S
v          cons;C                           s.S
(s,v)      C                                S

This optimization has to do with the identity combinator, without which our
combinatory logic would hardly be called 'categorical'! This leads us to a short
digression; besides being able to evaluate expressions relative to an environment
as developed here, categorical combinatory logic is able to simulate the β-reductions
in a natural way, using a set of rules distinct from the one used here, involving only
the 'pure' categorical combinators, i.e. excluding coupling and applying. The starting
point is the following rule:

(Beta) App ∘ ⟨Λ(x), y⟩ = x ∘ ⟨Id, y⟩

which then allows one to distribute ⟨Id, y⟩ (or whatever it becomes) along the structure
of x up to the leaves, which are n! for some n. Then we need the projection rules,
very similar to the ones involving the couple operator:

Fst ∘ ⟨x, y⟩ = x, Snd ∘ ⟨x, y⟩ = y.

As an illustration we simulate

λx.(λy.y)x → λx.x

which can be simulated neither by the SK-rules, nor by the rules on which our
machine is based, but rather either by the Curry axioms [13] or by rules of the kind
of the three rules we have just listed (in contrast to Curry axioms, the categorical
setting is very intuitive). Here is the simulation:

Λ(App ∘ ⟨Λ(Snd), Snd⟩) → Λ(Snd ∘ ⟨Id, Snd⟩) → Λ(Snd).

Now coming back to our optimization, Beta suggests the following code and 
reduction for let x = N in M: 

s          push;skip;swap;C1;cons;C    S
s          skip;swap;C1;cons;C         s.S
s          swap;C1;cons;C              s.S
s          C1;cons;C                   s.S
v          cons;C                      s.S
(s,v)      C                           S

where 'skip' stands for Id, with the nonaction effect. But the effect of
'push; skip; swap' is clearly the effect of 'push', and we have obtained a 'theoretical
foundation' for the optimization proposed above.

Another useful optimization is to compile M + N into: code of ⟨M, N⟩, followed
by 'plus'. This is justified by

App ∘ ⟨Λ(+ ∘ Snd), ⟨M, N⟩⟩ = + ∘ Snd ∘ ⟨Id, ⟨M, N⟩⟩ = + ∘ ⟨M, N⟩.

Here is the evaluation of our term P. We use the notation x|y = y∘x to make
compiling easier to follow; we take as abbreviations

B = ⟨C, '2⟩|*, C = ⟨Fst|Snd, Snd⟩|App, D = ⟨Snd, Fst|Snd⟩|+

which correspond to the subterms (zx) * 2, zx and y + x. We also contract some steps.



Term                      Code                    Stack

()                        ⟨'5⟩⟨Λ(D)⟩⟨'1⟩B         []
((),5)                    ⟨Λ(D)⟩⟨'1⟩B             []
(((),5),D:((),5))         ⟨'1⟩B                   []
((((),5),D:((),5)),1)     B                       []
((((),5),D:((),5)),1)     C,'2⟩*                  [((((),5),D:((),5)),1)]
((((),5),D:((),5)),1)     1!,0!⟩App,'2⟩*          [((((),5),D:((),5)),1); ((((),5),D:((),5)),1)]
((((),5),D:((),5)),1)     0!⟩App,'2⟩*             [D:((),5); ((((),5),D:((),5)),1)]
(D:((),5),1)              App,'2⟩*                [((((),5),D:((),5)),1)]
(((),5),1)                0!,1!⟩+,'2⟩*            [(((),5),1); ((((),5),D:((),5)),1)]
(((),5),1)                1!⟩+,'2⟩*               [1; ((((),5),D:((),5)),1)]
(1,5)                     +,'2⟩*                  [((((),5),D:((),5)),1)]
((((),5),D:((),5)),1)     '2⟩*                    [6]
(6,2)                     *                       []
12                        []                      []



We come to the promised discussion of static binding. Static binding means that
the free occurrence of x in the definition of z is bound to 5, the value of x when z
is declared. Dynamic binding would instead amount to binding the same x to whatever
its value is when the body y + x is evaluated, which would be 1 in our example.

To avoid this dynamic capture, the SECD machine had a closure mechanism,
which is nothing but assigning as value to an abstraction the couple of the abstraction
and its declaration environment. And this is exactly what our rule about 'cur' does.
And the reader may check that the instance of that rule creating D:((), 5) expresses
Landin's closure feature. The complementary feature is Landin's 'apply' instruction,
which evaluates the body D w.r.t. the environment obtained from the closure and
from the value last obtained (the argument of the function which the abstraction
represents). This is exactly what our rule 'app' does. But Landin's 'apply' also entails
saving mechanisms, which in our setting are carried out by 'push' and 'swap'.
Actually this is the only significant difference between the CAM and the SECD
machine.
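
The difference between the two binding disciplines can be made concrete with a small
environment-based evaluator. The Python sketch below is only an illustration and is
neither the CAM nor the SECD machine: it assumes named terms encoded as tagged tuples and
environments encoded as dictionaries, and the function name evaluate and its dynamic flag
are hypothetical. A closure records its declaration environment; under the dynamic
discipline that record is ignored in favour of the caller's environment.

```python
def evaluate(term, env, dynamic=False):
    """Evaluate a named term; `dynamic` selects the binding discipline."""
    tag = term[0]
    if tag == 'num':
        return term[1]
    if tag == 'var':
        return env[term[1]]
    if tag == 'lam':                      # closure: parameter, body, declaration env
        return ('closure', term[1], term[2], env)
    if tag == 'let':                      # let name = bound in body
        _, name, bound, body = term
        value = evaluate(bound, env, dynamic)
        return evaluate(body, {**env, name: value}, dynamic)
    if tag == 'app':
        _, param, body, declared = evaluate(term[1], env, dynamic)
        arg = evaluate(term[2], env, dynamic)
        base = env if dynamic else declared
        return evaluate(body, {**base, param: arg}, dynamic)
    if tag in ('add', 'mul'):
        a = evaluate(term[1], env, dynamic)
        b = evaluate(term[2], env, dynamic)
        return a + b if tag == 'add' else a * b
    raise ValueError(f'unknown term {tag!r}')


# let x = 5 in let z y = y + x in let x = 1 in (z x) * 2
P = ('let', 'x', ('num', 5),
     ('let', 'z', ('lam', 'y', ('add', ('var', 'y'), ('var', 'x'))),
      ('let', 'x', ('num', 1),
       ('mul', ('app', ('var', 'z'), ('var', 'x')), ('num', 2)))))

static_result = evaluate(P, {})
dynamic_result = evaluate(P, {}, dynamic=True)
print(static_result, dynamic_result)
```

With static binding the x inside z is the 5 recorded in the closure, which gives the
value 12 computed by the CAM above; with dynamic binding the same x is read as 1 at call
time.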

For the sake of completeness, we try to make the similarity clear, starting from
the classical description of the SECD machine, as found in the original paper [14],
or in tutorial presentations such as [12], with the difference that, consistently with the
CAM compiler, we assume evaluation from left to right in applications. So the code
for λ-calculus application MN is the code of M followed by the code of N followed
by 'apply'. The code of a number n! is 'access n', and the code for constants and
abstractions is as in the CAM (the code encapsulated in a closure ending with a
'return', see Section 5). Products are avoided by currying * and +. Table 2 gives
the rules of the machine, which has a stack of values, an environment component,
a code component, and a stack where environments are saved (the dump in the
SECD terminology).

At this stage, nothing is said about the way of representing environments. The
notation is not only vague, but even rather contradictory: 'access n' suggests a
vector, but the operation v.E1 is practically unfeasible as such: only a fixed size
of memory can be allocated for environments as vectors. Thus we are free to interpret
the SECD machine in a context where environments are represented as in the CAM.
Then 'access n' has to be changed into a sequence 'access; fst; ... ; fst; snd', where
the role of 'access' is to copy the top of the environment stack to the value stack.
We reformulate the machine, also simplifying it by considering the E component
as being the top of the D component, see Table 3. This looks not so different from
the CAM! To stress even more the similarity we present in Table 4 a two-component
version of the CAM, where the value component is now the top of the stack.

It should now be clear that the difference between the two last tables lies only in
the way of saving environments. The CAM has the conceptually simplest approach
(which is reflected by the simplicity of the correctness proof, as compared with the
proof of correctness of the SECD machine [18]). The other approach seems to
save some stack manipulations (think of expressions MN₁N₂...Nₙ, or
⟨...⟨M₁, M₂⟩, ... Mₙ⟩), but the price to pay is a more complex machine structure
(three components against two): a true implementation of two stacks is not so easy.
Let us mention also that some optimizations of the CAM tend to minimize the
number of stack manipulations by recognizing when expressions really need their
environment.

Table 2

The SECD machine

                 Configuration                                Configuration

Stack         Environment   Code            Dump      Stack       Environment   Code    Dump

S             E             (access n);C    D         v.S         E             C       D
S             E             (quote c);C     D         c.S         E             C       D
S             E             (cur C);C1      D         (C:E).S     E             C1      D
v.(C:E1).S    E             apply;C1        D         S           v.E1          C;C1    E.D
n.m.S         E             plus;C          D         m+n.S       E             C       D
S             E1            return;C        E.D       S           E             C       D

v is the nth element of E.



Table 3

The SECD machine, revisited

Stack        Code           Dump      Stack      Code    Dump

S            access;C       s.D       s.S        C       s.D
(s,t).S      fst;C          D         s.S        C       D
(s,t).S      snd;C          D         t.S        C       D
S            (quote c);C    D         c.S        C       D
S            (cur C);C1     s.D       (C:s).S    C1      s.D
t.(C:s).S    apply;C1       D         S          C;C1    (s,t).D
n.m.S        plus;C         D         m+n.S      C       D
S            return;C       t.D       S          C       D



Table 4

CAM revisited

Stack        Code           Stack      Code

(s,t).S      fst;C          s.S        C
(s,t).S      snd;C          t.S        C
s.S          (quote c);C    c.S        C
s.S          (cur C);C1     (C:s).S    C1
s.S          push;C         s.s.S      C
t.s.S        swap;C         s.t.S      C
t.s.S        cons;C         (s,t).S    C
t.(C:s).S    app;C1         (s,t).S    C;C1
n.m.S        plus;C         m+n.S      C

There exists another interpretation of the abstract description of the SECD machine,
adopted for example in Cardelli's Functional Abstract Machine [5]. He keeps the
environment-as-vector point of view; his solution in the 'apply' rule is to keep v in
the stack, and to create environments-as-vectors only when a closure has to be built.
This entails a distinction between local and global variables, which are accessed in
the stack and in the vector, respectively. The efficiency of his method as compared with
ours is clear for access times and function application, while closure building is his
most expensive operation. This cost becomes really apparent when running highly
functional programs, or implementing laziness. On the other hand, our actual
implementation represents the top level environment with a symbol table, so that
the access time problem concerns only the local environments, which in practice
are of small size.



3. Proofs (for the core of the CAM) 

Establishing the correctness of our machine amounts to formally justifying that it is
both like a reduction machine, and like a 'De Bruijn' machine, i.e. a device
performing β-reductions in De Bruijn's notation. More precisely we want to prove

(1) that the CAM stops with empty code and stack if and only if the evaluated
term reduces in innermost strategy to the term of the final configuration, using the
rules of Section 1;

(2) that the innermost combinatory evaluation of a term stops if and only if
its call-by-value evaluation by De Bruijn's β-reduction stops, and that the final
De Bruijn's term realizes the final combinatory term in a sense which we define
below.

Since we tackle formal proofs, we need a formal definition of our different calculi.
We shall call De Bruijn's calculus the set DBC of terms defined as follows (using
grammar notation):

DBC ::= n! | ⟨DBC, DBC⟩ | fst(DBC) | snd(DBC) | S(DBC, DBC) | Λ(DBC).

Notice that unlike ML (see Section 5) we consider here fst, snd as operators of
arity 1: this allows for an easier compilation. De Bruijn's expressions are denoted
by M, N, M₁, ...

The De Bruijn's expressions can be considered as categorical terms if n! (called
variable), fst(DBC), snd(DBC) and S(DBC, DBC) are taken as shorthands for
Snd ∘ Fst^n, Fst ∘ DBC, Snd ∘ DBC and App ∘ ⟨DBC, DBC⟩, resp.

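We propose to describe the rewrite strategy in the form of a deductive system in
the style of [19]. The system is defined on a set of questions and answers. Questions
have the form $M?v$ where $M$ is a De Bruijn's expression and $v$ is a value; values
are certain categorical terms (we shall be more precise later, but for our first
proposition we do not care). Answers are values. A state is either a question or an
answer (a rule applies only if no earlier rule applies):

\[
\Lambda(M)?v \to_c^{*} \Lambda(M)v
\]

\[
(n+1)!?(v_1, v_2) \to_c^{*} n!?v_1
\qquad
0!?(v_1, v_2) \to_c^{*} v_2
\]

\[
\frac{M?v \to_c^{*} \Lambda(M_1)v_1 \qquad N?v \to_c^{*} v_2}{S(M, N)?v \to_c^{*} M_1?(v_1, v_2)}
\qquad
\frac{M?v \to_c^{*} v_1 \qquad N?v \to_c^{*} v_2}{S(M, N)?v \to_c^{*} \mathit{App}(v_1, v_2)}
\]

\[
\frac{M?v \to_c^{*} v_1 \qquad N?v \to_c^{*} v_2}{\langle M, N \rangle?v \to_c^{*} (v_1, v_2)}
\]

\[
\frac{M?v \to_c^{*} (v_1, v_2)}{\mathit{fst}(M)?v \to_c^{*} v_1}
\qquad
\frac{M?v \to_c^{*} v_1}{\mathit{fst}(M)?v \to_c^{*} \mathit{Fst}(v_1)}
\]

\[
\frac{M?v \to_c^{*} (v_1, v_2)}{\mathit{snd}(M)?v \to_c^{*} v_2}
\qquad
\frac{M?v \to_c^{*} v_1}{\mathit{snd}(M)?v \to_c^{*} \mathit{Snd}(v_1)}
\]

\[
\frac{s_1 \to_c^{*} s_2 \qquad s_2 \to_c^{*} s_3}{s_1 \to_c^{*} s_3}
\]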

The rule for $\Lambda(M)$ accounts for the rule 'cur' of the CAM: $\Lambda(M)$ as a De Bruijn's
expression acts on $v$ and yields $\Lambda(M)v$ which is now a value.

Now we formalize the compilation of Section 2. We call CAM the compiling 
function: 

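\[
\begin{aligned}
\mathrm{CAM}((n+1)!) &= \mathtt{fst};\ \mathrm{CAM}(n!), \\
\mathrm{CAM}(0!) &= \mathtt{snd}, \\
\mathrm{CAM}(\Lambda(M)) &= (\mathtt{cur}\ \mathrm{CAM}(M)), \\
\mathrm{CAM}(S(M, N)) &= \mathtt{push};\ \mathrm{CAM}(M);\ \mathtt{swap};\ \mathrm{CAM}(N);\ \mathtt{cons};\ \mathtt{app}, \\
\mathrm{CAM}(\langle M, N \rangle) &= \mathtt{push};\ \mathrm{CAM}(M);\ \mathtt{swap};\ \mathrm{CAM}(N);\ \mathtt{cons}, \\
\mathrm{CAM}(\mathit{fst}(M)) &= \mathrm{CAM}(M);\ \mathtt{fst}, \\
\mathrm{CAM}(\mathit{snd}(M)) &= \mathrm{CAM}(M);\ \mathtt{snd}.
\end{aligned}
\]

Clearly CAM is injective, so that we may write

\[
M = \mathrm{CAM}^{-1}(C) \quad\text{if}\quad \mathrm{CAM}(M) = C.
\]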
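The clauses of the compiling function translate almost literally into a recursive function. The following OCaml sketch is only an illustration: the type and constructor names (`dbc`, `instr`, `Lam`, `IFst`, ...) are our own assumptions and do not come from the paper.

```ocaml
(* De Bruijn expressions, following the grammar of DBC.
   Constructor names are illustrative assumptions. *)
type dbc =
  | Var of int          (* n!        *)
  | Pair of dbc * dbc   (* <M, N>    *)
  | Fst of dbc          (* fst(M)    *)
  | Snd of dbc          (* snd(M)    *)
  | S of dbc * dbc      (* S(M, N)   *)
  | Lam of dbc          (* Lambda(M) *)

(* Instructions of the core machine. *)
type instr = IFst | ISnd | Push | Swap | Cons | App | Cur of instr list

(* The compiling function CAM, one match case per clause. *)
let rec compile (m : dbc) : instr list =
  match m with
  | Var 0 -> [ISnd]
  | Var n when n > 0 -> IFst :: compile (Var (n - 1))
  | Var _ -> invalid_arg "compile: negative De Bruijn index"
  | Lam body -> [Cur (compile body)]
  | S (f, a) -> [Push] @ compile f @ [Swap] @ compile a @ [Cons; App]
  | Pair (a, b) -> [Push] @ compile a @ [Swap] @ compile b @ [Cons]
  | Fst a -> compile a @ [IFst]
  | Snd a -> compile a @ [ISnd]
```

For instance, `compile (Var 2)` yields the access path `[IFst; IFst; ISnd]`, i.e. fst; fst; snd.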

We shall need to add rules to the CAM to be able to build the normal form in the
term part when the machine stops, since as it stands it may stop with nonempty
code. For example, when executing fst(Λ(Snd)) (our untyped setting allows this),
the machine stops with code 'fst' and term (cur snd). So we duplicate the rules for
'fst', 'snd' and 'app', with the convention that the new ones apply only when the
corresponding three 'active' rules defined in Section 2 cannot apply. We slightly
rephrase the machine in order to avoid heaviness in the correctness statement, see
Table 5. Also we shall write $\{s, C, S\} \to \{s_1, C_1, S_1\}$ when the machine moves from
one configuration to another by one of its rules. Now the correctness of the CAM relies
on the following.

Proposition 3.1. For any De Bruijn's expression $M$ and any value $v$, the following are
equivalent:

(1) $M?v \to_c^{*} v_1$;

(2) $\{v, \mathrm{CAM}(M), []\} \to^{*} \{v_1, [], []\}$.

Proof. In both directions the following remark is useful: if

\[
\{s, C, S\} \to^{*} \{s_1, C_1, S_1\},
\]

then for any $C_2$ and $S_2$

\[
\{s, C@C_2, S@S_2\} \to^{*} \{s_1, C_1@C_2, S_1@S_2\}.
\]

To show (1)$\Rightarrow$(2) we also show (1a)$\Rightarrow$(2a) where

(1a) $M?v \to_c^{*} M_1?v_1$;

(2a) $\{v, \mathrm{CAM}(M), []\} \to^{*} \{v_1, \mathrm{CAM}(M_1), []\}$.



Table 5

    Term          Code          Stack   →  Term               Code         Stack

    (s, t)        fst;C         S       →  s                  C            S
    s             fst;C         S       →  Fst s              C            S
    (s, t)        snd;C         S       →  t                  C            S
    s             snd;C         S       →  Snd s              C            S
    s             (cur C);C1    S       →  Λ(CAM^{-1}(C))s    C1           S
    s             push;C        S       →  s                  C            s.S
    t             swap;C        s.S     →  s                  C            t.S
    t             cons;C        s.S     →  (s, t)             C            S
    (Λ(M)s, t)    app;C1        S       →  (s, t)             CAM(M)@C1    S
    s             app;C         S       →  App s              C            S
The proof is a routine inspection of the different cases of the deduction system. In
the other direction the property to be proved is that if the CAM started as described 
in the statement stops after n steps, then the final configuration has the shape shown 
in (2), and (1) is true, which is again a routine induction on n. □ 
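Proposition 3.1 can be tried out on small terms. The sketch below is an illustration under our own assumptions, not the paper's proof: `eval` is a big-step reading of the deductive system for $M?v$, `run` follows the rules of Table 5, and `agrees` compares the two results. Closures are parameterised by what they store (a term for `eval`, compiled code for `run`), which sidesteps the need for an explicit $\mathrm{CAM}^{-1}$. All identifiers are hypothetical.

```ocaml
type dbc = Var of int | Pair of dbc * dbc | Fst of dbc | Snd of dbc
         | S of dbc * dbc | Lam of dbc

(* Values; 'a is what a closure stores (a term or compiled code). *)
type 'a value =
  | Unit
  | VPair of 'a value * 'a value
  | Clo of 'a * 'a value   (* Lambda(M)v *)
  | VFst of 'a value       (* Fst v, v not a couple *)
  | VSnd of 'a value       (* Snd v, v not a couple *)
  | VApp of 'a value       (* App v, produced by the fallback rule *)

type instr = IFst | ISnd | Push | Swap | Cons | App | Cur of instr list

let rec compile = function
  | Var 0 -> [ISnd]
  | Var n -> IFst :: compile (Var (n - 1))
  | Lam m -> [Cur (compile m)]
  | S (m, n) -> [Push] @ compile m @ [Swap] @ compile n @ [Cons; App]
  | Pair (m, n) -> [Push] @ compile m @ [Swap] @ compile n @ [Cons]
  | Fst m -> compile m @ [IFst]
  | Snd m -> compile m @ [ISnd]

(* Answer to the question M?v; fails where no rule applies. *)
let rec eval (m : dbc) (v : dbc value) : dbc value =
  match m, v with
  | Var 0, VPair (_, v2) -> v2
  | Var n, VPair (v1, _) when n > 0 -> eval (Var (n - 1)) v1
  | Var _, _ -> failwith "variable looked up in a non-couple"
  | Lam body, env -> Clo (body, env)
  | S (f, a), env ->
      let vf = eval f env in
      let va = eval a env in
      (match vf with
       | Clo (m1, v1) -> eval m1 (VPair (v1, va))
       | _ -> VApp (VPair (vf, va)))
  | Pair (a, b), env -> VPair (eval a env, eval b env)
  | Fst a, env -> (match eval a env with VPair (v1, _) -> v1 | w -> VFst w)
  | Snd a, env -> (match eval a env with VPair (_, v2) -> v2 | w -> VSnd w)

(* The machine of Table 5; earlier cases take priority over fallbacks. *)
let rec run (t : instr list value) (c : instr list)
    (st : instr list value list) : instr list value =
  match t, c, st with
  | t, [], [] -> t
  | VPair (s, _), IFst :: c, st -> run s c st
  | s, IFst :: c, st -> run (VFst s) c st
  | VPair (_, t), ISnd :: c, st -> run t c st
  | s, ISnd :: c, st -> run (VSnd s) c st
  | s, Cur c1 :: c, st -> run (Clo (c1, s)) c st
  | s, Push :: c, st -> run s c (s :: st)
  | t, Swap :: c, s :: st -> run s c (t :: st)
  | t, Cons :: c, s :: st -> run (VPair (s, t)) c st
  | VPair (Clo (c1, s), t), App :: c, st -> run (VPair (s, t)) (c1 @ c) st
  | s, App :: c, st -> run (VApp s) c st
  | _ -> failwith "stuck configuration"

(* Compile the terms stored in closures so both results are comparable. *)
let rec to_machine : dbc value -> instr list value = function
  | Unit -> Unit
  | VPair (a, b) -> VPair (to_machine a, to_machine b)
  | Clo (m, v) -> Clo (compile m, to_machine v)
  | VFst v -> VFst (to_machine v)
  | VSnd v -> VSnd (to_machine v)
  | VApp v -> VApp (to_machine v)

let agrees m v = run (to_machine v) (compile m) [] = to_machine (eval m v)
```

Evaluating `agrees (Fst (Lam (Var 0))) Unit` exercises the fallback rule for 'fst' on the example fst(Λ(Snd)) mentioned above.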

The second proof is more involved, and requires going into the details of states.

The environment order of a De Bruijn's expression M is the maximal difference, 
when nonnegative, n + 1 - m where n ! ranges over variable occurrences of M and 
m is the number of A's above the concerned occurrence in the term. A De Bruijn's 
expression is closed when its environment order is 0. 
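The definition can be read as a single traversal that counts the Λ's crossed on the way down. A minimal OCaml sketch, with hypothetical names (`env_order`, `closed`) and our own constructor names for De Bruijn expressions:

```ocaml
type dbc = Var of int | Pair of dbc * dbc | Fst of dbc | Snd of dbc
         | S of dbc * dbc | Lam of dbc

(* Environment order: the maximum of n + 1 - m over variable occurrences
   n!, where m is the number of Lambda's above the occurrence; negative
   differences are ignored, so a term without free variables has order 0. *)
let env_order (m : dbc) : int =
  let rec go depth = function
    | Var n -> max 0 (n + 1 - depth)
    | Lam body -> go (depth + 1) body
    | Pair (a, b) | S (a, b) -> max (go depth a) (go depth b)
    | Fst a | Snd a -> go depth a
  in
  go 0 m

(* A De Bruijn expression is closed when its environment order is 0. *)
let closed (m : dbc) : bool = env_order m = 0
```

With this reading, `Lam (Var 0)` is closed, while `Lam (S (Var 0, Var 2))` has environment order 2 because the occurrence 2! sits under a single Λ.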

We restrict the set of values to be the set of terms defined as follows:

- abstraction: $\Lambda(M)((), v_{n-1}, \ldots, v_0)$ is a value (more specifically an abstraction
value) if $M$ is a De Bruijn's expression of environment order at most $n$ and
$v_0, \ldots, v_{n-1}$ are values,

- couple: $(v_1, v_2)$ is a value if $v_1, v_2$ are values,

- first projection: $\mathit{Fst}\ v$ if $v$ is a value which is not a couple,

- second projection: $\mathit{Snd}\ v$ if $v$ is a value which is not a couple,

- application: $\mathit{App}(v_1, v_2)$ if $v_1, v_2$ are values and $v_1$ is not an abstraction value.

We restrict the set of questions by allowing only $M?((), v_{n-1}, \ldots, v_0)$ where $M$ is
a De Bruijn's expression of environment order at most $n$ and $v_{n-1}, \ldots, v_0$ are values.

These definitions are justified by the following facts (left to the reader):

- values are normal forms w.r.t. $\to_c^{*}$ (whence the terminology at the beginning of
the section),

- if $M?v$ is a state and $M?v \to_c^{*} s$ for some $s$, then $s$ is a state.

To be able to compare combinatory computations with computations by
$\beta$-reductions, we must be able to recover De Bruijn's expressions from states. This is
performed by the function REAL defined as follows:

\[
\begin{aligned}
\mathrm{REAL}(M?((), v_{n-1}, \ldots, v_0)) &= M[\mathrm{REAL}(v_0); \mathrm{REAL}(v_1); \ldots; \mathrm{REAL}(v_{n-1})], \\
\mathrm{REAL}(\Lambda(M)((), v_{n-1}, \ldots, v_0)) &= \mathrm{REAL}(\Lambda(M)?((), v_{n-1}, \ldots, v_0)), \\
\mathrm{REAL}((v_1, v_2)) &= \langle \mathrm{REAL}(v_1), \mathrm{REAL}(v_2) \rangle, \\
\mathrm{REAL}(\mathit{Fst}\ v) &= \mathit{fst}(\mathrm{REAL}(v)), \\
\mathrm{REAL}(\mathit{Snd}\ v) &= \mathit{snd}(\mathrm{REAL}(v)), \\
\mathrm{REAL}(\mathit{App}(v_1, v_2)) &= S(\mathrm{REAL}(v_1), \mathrm{REAL}(v_2)).
\end{aligned}
\]

The notation in the first line denotes the substitution: for $M$ of environment order
at most $n$, and closed $N_0, N_1, \ldots, N_{n-1}$ (notice that $\mathrm{REAL}(s)$ is closed),
$M[N_0; \ldots; N_{n-1}]$ is defined by

\[
\begin{aligned}
i![N_0; \ldots; N_{n-1}] &= N_i \quad\text{if } 0 \le i \le n-1, \\
(\Lambda(M))[N_0; \ldots; N_{n-1}] &= \Lambda(M[0!; N_0; \ldots; N_{n-1}]),
\end{aligned}
\]

the other cases are mere distribution.
Notice that if $M$ is closed, then $\mathrm{REAL}(M?()) = M$.
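The simultaneous substitution can be sketched as a function on De Bruijn expressions. This is an illustration under assumptions of our own: the names `lift` and `subst` are hypothetical, and, unlike the definition in the text, the sketch lifts the substituted entries each time it passes under a Λ. Lifting is the identity on closed terms, but it keeps the 0! entries introduced by enclosing Λ's pointing at the right binder when abstractions are nested.

```ocaml
type dbc = Var of int | Pair of dbc * dbc | Fst of dbc | Snd of dbc
         | S of dbc * dbc | Lam of dbc

(* lift k m: add 1 to every variable of m whose index is at least k,
   i.e. every variable that is free once k binders have been crossed. *)
let rec lift (k : int) : dbc -> dbc = function
  | Var n -> if n >= k then Var (n + 1) else Var n
  | Lam body -> Lam (lift (k + 1) body)
  | Pair (a, b) -> Pair (lift k a, lift k b)
  | S (a, b) -> S (lift k a, lift k b)
  | Fst a -> Fst (lift k a)
  | Snd a -> Snd (lift k a)

(* subst m [n0; ...; nk] computes M[N0; ...; Nk]. *)
let rec subst (m : dbc) (ns : dbc list) : dbc =
  match m with
  | Var i ->
      (match List.nth_opt ns i with
       | Some n -> n
       | None -> invalid_arg "subst: environment order too large")
  | Lam body -> Lam (subst body (Var 0 :: List.map (lift 0) ns))
  | Pair (a, b) -> Pair (subst a ns, subst b ns)
  | S (a, b) -> S (subst a ns, subst b ns)
  | Fst a -> Fst (subst a ns)
  | Snd a -> Snd (subst a ns)
```

In particular `subst m []` returns `m` unchanged whenever `m` is closed, which matches the remark that REAL(M?()) = M for closed M.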
Now we have to specify the call-by-value evaluation strategy for the $\lambda$-calculus
in De Bruijn's notation. We only care for closed expressions. The normal forms
w.r.t. the strategy are the abstractions, the couples of normal forms, the projections
of normal forms which are not couples, and the $\lambda$-calculus applications of normal
forms the first of which is not an abstraction. They are called $\lambda$-calculus values,
and denoted by $V, V', \ldots$

The strategy is specified much like the combinatory strategy, by the following
deductive system $\to_B^{*}$ ('B' stands for De Bruijn), having a unique axiom:

B 

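\[
\frac{M \to_B^{*} V_1 \qquad N \to_B^{*} V_2}{\langle M, N \rangle \to_B^{*} \langle V_1, V_2 \rangle}
\]

\[
\frac{M \to_B^{*} \Lambda(M_1) \qquad N \to_B^{*} V_2}{S(M, N) \to_B^{*} M_1[V_2]}
\qquad
\frac{M \to_B^{*} V_1 \qquad N \to_B^{*} V_2}{S(M, N) \to_B^{*} S(V_1, V_2)}
\]

\[
\frac{M \to_B^{*} \langle V_1, V_2 \rangle}{\mathit{fst}(M) \to_B^{*} V_1}
\qquad
\frac{M \to_B^{*} V}{\mathit{fst}(M) \to_B^{*} \mathit{fst}(V)}
\]

\[
\frac{M \to_B^{*} \langle V_1, V_2 \rangle}{\mathit{snd}(M) \to_B^{*} V_2}
\qquad
\frac{M \to_B^{*} V}{\mathit{snd}(M) \to_B^{*} \mathit{snd}(V)}
\]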

Now we can state the second proposition of the section. 

Proposition 3.2. The following are equivalent, for any closed $M$:

(1) $M \to_B^{*} V$,

(2) $M?() \to_c^{*} v$ and $\mathrm{REAL}(v) = V$.

Proof. To prove (2)$\Rightarrow$(1) we establish (2a)$\Rightarrow$(1a) where

(2a) $s \to_c^{*} t$,

(1a) $\mathrm{REAL}(s) \to_B^{*} \mathrm{REAL}(t)$.

The only interesting case is the first rule with $S$. We may suppose

\[
\mathrm{REAL}(M?v) \to_B^{*} \mathrm{REAL}(\Lambda(M_1)?((), v_1^{n-1}, \ldots, v_1^{0}))
\]

and

\[
\mathrm{REAL}(N?v) \to_B^{*} \mathrm{REAL}(v_2).
\]

Now setting $V$ decorated with the corresponding subscript (and superscript) for
the realization of $v$, we have

\[
\mathrm{REAL}(M?v) \to_B^{*} \Lambda(M_1[0!; V_1^{0}; \ldots; V_1^{n-1}])
\quad\text{and}\quad
\mathrm{REAL}(N?v) \to_B^{*} V_2.
\]

Hence we know from the definition of $\to_B^{*}$

\[
\begin{aligned}
S(\mathrm{REAL}(M?v), \mathrm{REAL}(N?v)) &= \mathrm{REAL}(S(M, N)?v) \\
&\to_B^{*} (M_1[0!; V_1^{0}; \ldots; V_1^{n-1}])[V_2] \\
&= M_1[V_2; V_1^{0}; \ldots; V_1^{n-1}] \\
&= \mathrm{REAL}(M_1?((), v_1^{n-1}, \ldots, v_1^{0}, v_2))
\end{aligned}
\]

(the last but one equality is easily checked by induction on $M_1$).

To get (2)$\Rightarrow$(1) from (2a)$\Rightarrow$(1a) we only have to check that for any $v$, $\mathrm{REAL}(v)$
is a value, which is obvious by definition.

To show (1)$\Rightarrow$(2) first we establish that if $\mathrm{REAL}(s) = V$, then $s \to_c^{*} v$ for some
$v$. This is non trivial only if $s = M?v$ and one proceeds by induction on $M$, using
(2a)$\Rightarrow$(1a) for the application, while for $M = n!$ we notice that obviously
$n!?((), v_m, \ldots, v_0) \to_c^{*} v_n$. Then showing (1)$\Rightarrow$(2) reduces to (1b)$\Rightarrow$(2b) where

(1b) $M \to_B^{*} N$,

(2b) if $\mathrm{REAL}(s) = M$, then $s \to_c^{*} t$ for some $t$ s.t. $\mathrm{REAL}(t) = N$,

which is proved by induction on the cases of $\to_B^{*}$ much like (2a)$\Rightarrow$(1a). □

This puts an end to our proof theoretical incursion. 



4. Extending the machine to handle conditionals, recursion and lazy evaluation 

We turn our attention to recursive function definitions. Here is the unavoidable 
factorial function: 

letrec fact n = if n = 0 then 1 else n * fact(n - 1) in fact 1;; 

We turn it into a λ-expression using a fixed-point functional Y (which we suppose
to be a constant rather than defined, because we seek an efficient evaluation):

    (λg. g 1)(Y(λf n. if n = 0 then 1 else n * f(n - 1))).

Or rather, to benefit from the optimization in Section 2,

    let g = Y(λf n. if n = 0 then 1 else n * f(n - 1)) in g 1.

So there are two new constructions to examine.
The code for if M then N else P is 'push' followed by the code of M, followed
by 'branch (C1, C2)', where C1, C2 are bodies for N, P, with the following effect:

branch: replace the term by the top of the stack, and, according to whether the
term was 'true' or 'false', execute C1 or C2.

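For explaining our implementation of recursion we shall start from the fixed-point
operator $Y$ of the $\lambda$-calculus, obeying the well-known rule

\[
YM = M(YM) \quad\text{or}\quad [\![YM]\!] = \mathit{App} \circ \langle [\![M]\!], [\![YM]\!] \rangle.
\]

This suggests introducing a new unary combinator Fix, where $\mathit{Fix}(C)$ is the
abbreviation of $\mathit{App} \circ \langle [\![Y]\!], C \rangle$. Using one or another coding of the $Y$ combinator
in the pure $\lambda$-calculus, we derive

\[
(\mathit{Fix}) \qquad \mathit{Fix}(C) = \mathit{App} \circ \langle C, \mathit{Fix}(C) \rangle.
\]

We first concentrate on recursive functional definitions, so we can assume that the
argument of Fix has the form $\Lambda(\Lambda(M))$. We get

\[
\begin{aligned}
\mathit{Fix}(\Lambda(\Lambda(M))) &= \mathit{App} \circ \langle \Lambda(\Lambda(M)), \mathit{Fix}(\Lambda(\Lambda(M))) \rangle \\
&= \Lambda(M) \circ \langle \mathit{Id}, \mathit{Fix}(\Lambda(\Lambda(M))) \rangle
\end{aligned}
\]

and, abbreviating $\mathit{Fix}(\Lambda(\Lambda(M)))$ to $F(M)$,

\[
F(M) = \Lambda(M) \circ \langle \mathit{Id}, F(M) \rangle.
\]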

What should the corresponding instruction 'F' of the machine look like? If C is 
the code of M, the effect of F(C) should be the same as executing the sequence 
'push; F(C); cons; (cur C)'. Let t be the result of the action of F(C) on the term 
s; t should also be obtained using the alternative sequence of instructions, forcing 

    t = C:(s, t)

or in figure:

[Figure: the cyclic graph representing t = C:(s, t).]

But such an instruction does not quite look like a real machine instruction. The 
usual way of winding structures is to destructively replace in a couple one of its 
elements by whatever you want (cf. LISP's 'rplac' instruction). Moreover this 
treatment is not easily extendable to simultaneous recursive definitions nor to 
non-functional recursive definitions. 
So we decompose the effect of F(C) into two steps. First, construct the object

    C:(s, ())

where () is a dummy value. Second, construct the previous looping structure through
a new instruction 'wind' which has the following effect:



wind: physically replace the right part of the top of the stack (supposed to be a 
couple) by the value, and remove the top of the stack. 

F(C) is now compiled into 

push; quote (); cons; push; (cur C); wind 
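The destructive update performed by 'wind' can be mimicked with a mutable record field. The sketch below is only an illustration under our own assumptions: couples are represented by a record `cell` whose right component is mutable, code is stood in for by a string, and `fix_closure` replays by hand the effect of the sequence push; quote (); cons; push; (cur C); wind on a term s.

```ocaml
type value =
  | Unit
  | Int of int
  | Pair of cell
  | Closure of string * value   (* C:s, with the code C named by a string *)
and cell = { left : value; mutable right : value }

(* wind: physically replace the right part of the couple on top of the
   stack by the term, then remove the top of the stack. *)
let wind (term : value) (stack : value list) : value * value list =
  match stack with
  | Pair c :: rest ->
      c.right <- term;
      (term, rest)
  | _ -> invalid_arg "wind: top of stack is not a couple"

(* Effect of the code compiled for F(C) on a term s. *)
let fix_closure (code : string) (s : value) : value =
  let couple = { left = s; right = Unit } in   (* push; quote (); cons *)
  let stack = [Pair couple] in                 (* push                 *)
  let term = Closure (code, Pair couple) in    (* (cur C)              *)
  let term, _ = wind term stack in             (* wind                 *)
  term
```

The result t satisfies t = C:(s, t) as a graph: the right component of its environment couple is physically the closure itself. Such a value is cyclic, so it should be inspected with physical equality (`==`) rather than structural equality, which may not terminate.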

Summarizing, we have augmented the machine with the conditionals and recursion 
given in Table 6. 

We run our example ('w', 'b' stand for 'wind' and 'branch'; B and C are the codes
corresponding to if n = 0 then 1 else n * f(n - 1) and n * f(n - 1)).

    Term                      Code                        Stack

    ()                        <<'()><Λ(B)w><0!,'1>App     []
    ((),())                   Λ(B)w><0!,'1>App            [((),()); ()]
    B:((),())                 w><0!,'1>App                [((),()); ()]
    B:(s = ((),B:s))          ><0!,'1>App                 [()]
    ((),B:(s = ((),B:s)))     0!,'1>App                   [((),B:(s = ((),B:s)))]
    B:(s = ((),B:s))          ,'1>App                     [((),B:(s = ((),B:s)))]
    1                         >App                        [B:(s = ((),B:s))]
    (B:(s = ((),B:s)),1)      App                         []
    (s = ((),B:s),1)          <<0!,'0>=b('1,C)            []
    (s = ((),B:s),1)          '0>=b('1,C)                 [1; (s = ((),B:s),1)]
    false                     b('1,C)                     [(s = ((),B:s),1)]
    (s = ((),B:s),1)          <0!,<1!,<0!,'1>->App>*      []
    (s = ((),B:s),1)          <1!,<0!,'1>->App>*          [1]
    (s = ((),B:s),1)          <0!,'1>->App>*              [B:(s = ((),B:s)); 1]
    (s = ((),B:s),1)          '1>->App>*                  [1; B:(s = ((),B:s)); 1]
    0                         >App>*                      [B:(s = ((),B:s)); 1]
    (s = ((),B:s),0)          B>*                         [1]
    (s = ((),B:s),0)          <0!,'0>=b('1,C)>*           [(s = ((),B:s),0); 1]

Table 6

Conditionals and recursion

    Term     Code                   Stack       →  Term                 Code    Stack

    true     (branch (C1, C2));C    s.S         →  s                    C1;C    S
    false    (branch (C1, C2));C    s.S         →  s                    C2;C    S
    s        wind;C                 (t,()).S    →  s[(t,()) ← (t,s)]    C       S
    Term                      Code                        Stack

    (0,0)                     =b('1,C)>*                  [(s = ((),B:s),0); 1]
    (s = ((),B:s),0)          '1>*                        [1]
    (1,1)                     *                           []
    1                                                     []



(notice that s = ((), B:s) and ((), B:(s = ((), B:s))) denote the same graph). 

So far we have described only call by value, eager evaluation mechanisms: 
arguments of functions are evaluated completely, components of couples also. Call 
by name, lazy evaluation, as is well known, may avoid useless computations and 
allow the manipulation of infinite structures. 

At the language level, the laziness may be 

- either implicit (guaranteed by the compiler), 

- or achieved by the explicit use of a 'freeze' primitive. 

We follow the second choice here although ML has no such primitive presently. 

As was already remarked by Plotkin [18], the most natural way of implementing
call by name for functional expressions seems to be to introduce explicit delay, or freeze
instructions in the call by value framework, the action of which is quite similar to
the action of 'cur'.

So we add to our framework a 'freeze' instruction, acting as follows: 

freeze: replace the term s by the structure (C.s), where C is the code encapsulated 
in the freeze instruction. 

(C.s), which refers to Ms if M has C as code, is called a laze. The introduction of 
lazes modifies the nature of the term component of the machine: the term is not 
necessarily a value, but possibly contains unevaluated expressions, which have to 
be forced to evaluation. Thus the compiler has to insert 'unfreeze' instructions at 
appropriate places. A possible strategy, which we adopt here for simplicity, is to 
insert those instructions before the 'strict' instructions, i.e. those which cannot be 
executed on a laze: 'car', 'cdr', 'app', 'plus'. In this approach we have to take care
of the possible need of repetitively executing 'unfreeze', yielding the following 
description: 

unfreeze: performs no action (like 'skip' discussed above) unless the term is a 
laze C.s, in which case C is prefixed to the code (including 'unfreeze') and the term 
becomes s. 

The repetitive nature of 'unfreeze' can be avoided by restricting appropriately the 
places where 'freeze' instructions are inserted (specifically in the explicit approach, 
'freeze' can only appear as a component of a couple or an argument of an applica- 
tion). Also the non-repetitive 'unfreeze' has to be inserted differently (specifically 
after 'fst' or 'snd'). This is discussed in detail in [15, 16] (cf. also [9]). 

Table 7 gives a summary for our chosen variant. 
Table 7

Lazy evaluation

    Term    Code             Stack   →  Term    Code             Stack

    s       (freeze C);C1    S       →  C.s     C1               S
    C.s     unfreeze;C1      S       →  s       C;unfreeze;C1    S
    s       unfreeze;C       S       →  s       C                S

The third rule is applied if the second cannot be applied.
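The three rules of Table 7 fit into a small interpreter. The following OCaml sketch is an illustration under our own assumptions: it covers only the instructions needed here, uses the code-prefixing convention of Table 7 (no return addresses), and the names `Laze`, `run` and `example` are hypothetical.

```ocaml
type instr =
  | Quote of int | IFst | ISnd | Push | Swap | Cons | App
  | Cur of instr list | Freeze of instr list | Unfreeze

type value =
  | Unit
  | Int of int
  | Pair of value * value
  | Closure of instr list * value   (* C:s *)
  | Laze of instr list * value      (* C.s *)

let rec run (term : value) (code : instr list) (stack : value list) : value =
  match term, code, stack with
  | t, [], [] -> t
  (* Table 7; the third rule is tried only after the second. *)
  | s, Freeze c :: c1, st -> run (Laze (c, s)) c1 st
  | Laze (c, s), Unfreeze :: c1, st -> run s (c @ (Unfreeze :: c1)) st
  | s, Unfreeze :: c, st -> run s c st
  (* The instructions of the eager machine used by the example. *)
  | _, Quote n :: c, st -> run (Int n) c st
  | Pair (s, _), IFst :: c, st -> run s c st
  | Pair (_, t), ISnd :: c, st -> run t c st
  | s, Push :: c, st -> run s c (s :: st)
  | t, Swap :: c, s :: st -> run s c (t :: st)
  | t, Cons :: c, s :: st -> run (Pair (s, t)) c st
  | Pair (Closure (c1, s), t), App :: c, st -> run (Pair (s, t)) (c1 @ c) st
  | s, Cur c1 :: c, st -> run (Closure (c1, s)) c st
  | _ -> failwith "stuck configuration"

(* B: code of the application of the identity to 1. *)
let b = [Push; Cur [ISnd]; Swap; Quote 1; Cons; App]

(* let z = 2 in (fun x -> z) (freeze B): the laze is built, never forced. *)
let example =
  [Push; Quote 2; Cons; Push; Cur [IFst; ISnd]; Swap; Freeze b; Cons; App]
```

Running `run Unit example []` gives `Int 2` without executing `b`, while `run Unit [Freeze b; Unfreeze] []` forces the laze and gives `Int 1`.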



As an example we execute

    let z = 2 in (λx.z)(freeze((λy.y)1))

(f stands for 'freeze', and B is the code for (λy.y)1):

    Term                  Code                   Stack

    ()                    <'2><Λ(1!),f(B)>App    []
    ((),2)                <Λ(1!),f(B)>App        []
    ((),2)                f(B)>App               [1!:((),2)]
    B.((),2)              >App                   [1!:((),2)]
    (((),2),B.((),2))     1!                     []
    2                     []                     []

(as expected the useless computation of B has been avoided).

Our last example shows how recursion and laziness may be combined to handle 
recursively defined lists. We evaluate the following expression, expressed in an ML 
style: 

letrec x = (1, freeze x) in fst(snd(x)) 

The compiler translates fst(M) by: code of M followed by 'unfreeze', followed by
'fst', thus preparing for forcing delayed evaluations. D stands for '[push; quote
(); cons; push]'. We abbreviate 'freeze', 'unfreeze' into 'f', 'unf', and use '<', ',', '>',
''' instead of 'push', 'swap', 'cons', 'quote', to save space:

    Term                    Code                                       Stack

    ()                      <<'()><<'1,f(snd)>w>snd unf snd unf fst    []
    ((),())                 <<'1,f(snd)>w>snd unf snd unf fst          [()]
    ((),())                 f(snd)>w>snd unf snd unf fst               [1; ((),()); ()]
    (1,snd.((),()))         w>snd unf snd unf fst                      [((),()); ()]
    s = (1,snd.((),s))      >snd unf snd unf fst                       [()]
    s = (1,snd.((),s))      unf snd unf fst                            []
    s = (1,snd.((),s))      snd unf fst                                []
    s = snd.((),(1,s))      unf fst                                    []
    s = ((),(1,snd.s))      snd unf fst                                []
    s = (1,snd.((),s))      fst                                        []
    1                       []                                         []
5. Using the CAM to compile ML

We present here a compiler for a subset of ML [10] producing CAM code. We 
have decided to make this presentation completely effective by writing it in ML. 
ML programs and CAM programs are made into ML objects and the compiling 
process is made into an ML function. Moreover we also describe the execution of 
the CAM as an ML function. 

This ML description of an ML to CAM compiler happens to be as clear and 
concise as it could be in any other formalism thanks to the concrete types feature 
that has been recently proposed by Milner [17] and added to existing implementa- 
tions of ML [4, 11]. This feature enables one to describe and manipulate object 
languages in ML by means of their abstract syntax (concrete syntax can also be 
used through an interface between ML and Yacc [11]). 

The ML subset we have chosen to take into account in the following description 
is of course the result of a compromise between significance and space. It seemed 
reasonable to omit concrete and abstract type declaration since they mainly concern 
the type-checker and not the compiler. Our most serious omissions are references 
and assignments and also exceptions. What remains is 

- integers and booleans together with arithmetic, equality tests and conditionals, 

- A-calculus together with let and letrec constructions (abstractions and let's are 
allowed w.r.t. patterns). 

Here is our ML subset:

    type rec MLexp =
          mlplus | ... | mlequal | mlfst | mlsnd
        | mlint of int
        | mlbool of bool
        | mlvar of string
        | mlcond of MLexp # MLexp # MLexp
        | mlpair of MLexp # MLexp
        | mlin of MLdec # MLexp
        | mlabstr of MLpat # MLexp
        | mlapp of MLexp # MLexp
    and MLdec =
          mllet of MLpat # MLexp
        | mlletrec of MLpat # MLexp
    and MLpat =
          nullpat
        | varpat of string
        | pairpat of MLpat # MLpat;;

The three concrete types MLexp, MLdec and MLpat have been defined. They
correspond respectively to ML expressions, ML declarations and ML patterns. Now
we have to give a similar description of what CAM instructions are. It is given
below together with the definition of CAM values:

    type rec instruction =
          plus | ... | eq
        | quote of value
        | fst | snd | cons | wind
        | push | swap | return | app
        | cur of code
        | branch of code # code
    and value =
          nullvalue
        | int of int
        | bool of bool
        | pair of value # value
        | closure of code # value
    and code == instruction list;;

To make our machine more realistic compared to the description given in previous 
sections, we shall save return addresses (pointers to code) on the stack. So the stack 
elements will be either values or addresses. Also a 'return' instruction is added. 
Table 8 gives the modification of the CAM. 

    type stackelem = val of value
                   | cod of code;;
    type stack == stackelem list;;
    type config == value # code # stack;;

Now we describe the CAM as an ML function Exec with type 'config -> config',
defined by pattern matching on configurations:

    let rec Exec = fun
        (pair(x, y), (fst::C), D) -> Exec(x, C, D)
      | (pair(x, y), (snd::C), D) -> Exec(y, C, D)
      | (x, (cons::C), ((val y)::D)) -> Exec(pair(y, x), C, D)
      | (x, (wind::C), ((val(pair(y, z)) as u)::D)) -> (z := x); Exec(u, C, D)
      | (x, (push::C), D) -> Exec(x, C, (val x)::D)
      | (x, (swap::C), ((val y)::D)) -> Exec(y, C, (val x)::D)
      | (T, ((quote v)::C), D) -> Exec(v, C, D)
      | (pair(closure(x, y), z), (app::C), D) -> Exec(pair(y, z), x, (cod C)::D)
      | ((bool b), ((branch(C1, C2))::C), ((val x)::D))
            -> Exec(x, (if b then C1 else C2), (cod C)::D)


Table 8

Saving code on the stack

    Term         Code      Stack   →  Term      Code    Stack

    (C1:s, t)    app;C     S       →  (s, t)    C1      C.S
    s            return    C.S     →  s         C       S
      | ((pair(int m, int n)), (plus::C), D) -> Exec(int(m + n), C, D)
      | ...
      | ((pair(int m, int n)), (eq::C), D) -> Exec(bool(m = n), C, D)
      | (x, ((cur C1)::C), D) -> Exec(closure(C1, x), C, D)
      | (x, (return::C), ((cod C')::D)) -> Exec(x, C', D)
      | config -> config;;

We owe the reader an explanation for the use of an assignment in the 'wind' case. 
Since components of concrete objects are non-assignable in ML, we should have 
made value into a reference type. But this would have complicated the description 
of all the other cases where no destructive operations are used. So we cheated a bit. 

Finally we give the compiling function. Since ML uses real identifiers and not 
De Bruijn codings for these, our compiling function will have to deal with the 
translation of a variable to some access code that will find at run time its value in 
the environment. Thus the compiling function has an extra parameter which is a 
pattern giving the position of the free variables of the expression to be compiled in 
the environment. The translation of a variable will be taken into account by the 
auxiliary function 'access' defined by: 

    let rec access id = fun
        nullpat -> fail
      | (varpat x) -> if x = id then [] else fail
      | (pairpat (x1, x2)) -> snd :: (access id x2) ? (fst :: (access id x1));;
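For readers more familiar with present-day ML dialects, here is a hedged OCaml transcription of 'access'. It is an illustration only: the failure mechanism of the original ('fail', trapped by '?') is replaced by an `option` result, which is our own choice, and the constructor names are adapted to OCaml's capitalisation rules.

```ocaml
type pat = NullPat | VarPat of string | PairPat of pat * pat
type instr = IFst | ISnd

(* Access path of the identifier id in the environment pattern, or None
   where the original function fails. The right component of a couple is
   searched first, so the most recent binding of a name wins. *)
let rec access (id : string) : pat -> instr list option = function
  | NullPat -> None
  | VarPat x -> if x = id then Some [] else None
  | PairPat (p1, p2) ->
      (match access id p2 with
       | Some path -> Some (ISnd :: path)
       | None -> Option.map (fun path -> IFst :: path) (access id p1))

(* Environment pattern ((nullpat, x), y), as built by two nested bindings. *)
let env = PairPat (PairPat (NullPat, VarPat "x"), VarPat "y")
```

Here `access "y" env` is `Some [ISnd]` and `access "x" env` is `Some [IFst; ISnd]`, the usual De Bruijn-style paths 0! and 1!.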

and the compiling function is the following: 

    let rec Compile pat = fun
        (mlint n) -> [quote (int n)]
      | (mlbool b) -> [quote (bool b)]
      | (mlvar v) -> access v pat
      | (mlcond (E1, E2, E3)) -> [push] @ (Compile pat E1)
                                 @ [branch ((Compile pat E2) @ [return],
                                            (Compile pat E3) @ [return])]
      | (mlpair (E1, E2)) -> [push] @ (Compile pat E1) @ [swap]
                             @ (Compile pat E2) @ [cons]
      | (mlin (mllet (P, E1), E2)) -> [push] @ (Compile pat E1) @ [cons]
                                      @ (Compile pat' E2)
            where pat' = pairpat (pat, P)
      | (mlin (mlletrec (P, E1), E2)) -> [push; quote nullvalue; cons; push]
                                         @ (Compile pat' E1) @ [swap; wind]
                                         @ (Compile pat' E2)
            where pat' = pairpat (pat, P)
      | (mlabstr (P, E)) -> [cur ((Compile pat' E) @ [return])]
            where pat' = pairpat (pat, P)
      | (mlapp (E1, E2)) -> if is_constant E1
            then (Compile pat E2) @ (trans_constant E1)
            else [push] @ (Compile pat E1) @ [swap]
                 @ (Compile pat E2) @ [cons; app]
      | E -> if is_constant E
            then [cur (snd :: (trans_constant E))]
            else fail;;

where the functions 'is_constant' and 'trans_constant' are defined in the following 
way: 

    let is_constant e = mem e [mlplus; mlequal; mlfst; mlsnd];;
    let trans_constant =
        fun mlplus -> [plus] | mlequal -> [eq] | mlfst -> [fst] | mlsnd -> [snd];;

If we want to implement a lazy variant of ML, the changes we have to make to
this implementation are minor ones. Let us allow laziness by introducing in ML
an explicit 'freeze' operator as well as a freeze and an unfreeze instruction in the
CAM:

    type rec MLexp =
          mlfreeze of MLexp
        | ...

    type rec instruction =
          freeze of instruction list
        | unfreeze
        | ...
    and value =
          frozen of instruction list # value
        | ...

The behaviour of the CAM for these new features is described by the following rules:

    let rec Exec = fun
      | (T, ((freeze C1)::C), D) -> Exec(frozen(C1, T), C, D)
      | (frozen(C1, T1), (unfreeze::C), D)
            -> Exec(T1, C1, (cod (unfreeze::C))::D)
      | (T, (unfreeze::C), D) -> Exec(T, C, D)

Now the Compile function must translate 'mlfreeze E' to 'freeze C' where C is the
translation of E and also put unfreeze instructions where needed. These unfreeze
instructions are needed when an arithmetic operation or a selector function or 'app'
is to be executed, but the part of the value that has to be unfrozen is not the same
in the different cases. To apply 'fst' or 'snd' we must have a pair but its components
can be frozen. To apply 'app' we must have a pair the first component of which is
a closure. Finally, to apply an arithmetic operation, we must have a pair of integers.
So the Compile and trans_constant functions are modified into:

    let rec Compile pat = fun
      | (mlfreeze E) -> [freeze ((Compile pat E) @ [return])]
      | ...
      | (mlapp (E1, E2)) -> if is_constant E1
            then (Compile pat E2) @ (trans_constant E1)
            else [push] @ (Compile pat E1) @ [swap]
                 @ (Compile pat E2) @ [cons] @ unfa @ [app]
            where unfa = [push; snd; swap; fst; unfreeze; cons]
      | ...

    let trans_constant =
        let unfb = [unfreeze; push; snd; unfreeze; swap; fst; unfreeze; cons]
        in
        fun mlplus -> unfb @ [plus] | mlminus -> unfb @ [minus]
          | mltimes -> unfb @ [times] | mlequal -> unfb @ [eq]
          | mlfst -> [unfreeze; fst] | mlsnd -> [unfreeze; snd];;

Notice that we have put 'unfreeze' instructions only where they are really needed.
For example we know that the fst's and snd's that have been produced by the
function 'access' to access the environment will never have to operate on frozen
values and therefore we do not have to accompany them with 'unfreeze'. Similarly, the
instruction 'app' will never have to operate on a frozen value but always on a pair
and so the only requirement is to unfreeze the left part of this pair.

6. Conclusion 

The Categorical Abstract Machine arises very naturally from the semantic descrip- 
tion of functional programming languages. It can be seen as a kind of SECD machine 
where the management of environments is made explicit through graph structures. 
It is simpler and easier to prove correct than the traditional presentations of the 
SECD machine. It can easily incorporate lazy evaluation. Moreover we have shown 
that a functional programming language such as ML compiles very naturally to 
CAM code. 

Besides its 'theoretical' qualities, we think that the CAM can be used to build 
conceptually simple and reasonably fast implementations of functional languages 
on conventional architectures. Of course, as already mentioned in our discussion 
at the end of Section 2, the global environment has to be suitably represented. The 
simplicity of the compiler, and the existence of intuitively clear rewrite rules allow
a formal approach to code optimization, including the application of curried functions
to their successive arguments, the detection of closed functional expressions,
for which no closures are needed, the detection of more situations where the Beta
optimization could be used (like in "let f x = M in ... f a ..."). A combination of the
last optimizations allows a very efficient implementation of recursive function 
definitions as loops in the code rather than by creating a looping environment. 
These optimizations are described in [6, 14], and even more details will be found 
in [20]. 